66 research outputs found

    SpaCEM3: a software for biological module detection when data is incomplete, high dimensional and dependent

    Summary: Among classical methods for module detection, SpaCEM3 provides ad hoc algorithms that were shown to be particularly well adapted to specific features of biological data: high dimensionality, interactions between components (genes), and integrated treatment of missing observations. The software, currently at version 2.0, is developed in C++ and can be used either from the command line or through a GUI under Linux and Windows. Availability: The SpaCEM3 software, documentation, and datasets are available from http://spacem3.gforge.inria.fr/. Contact: [email protected]; [email protected]

    Cav2.1 C‐terminal fragments produced in Xenopus laevis oocytes do not modify the channel expression and functional properties

    The sequence and genomic organization of the CACNA1A gene, which encodes the Cav2.1 subunit of both P- and Q-type Ca2+ channels, are well conserved in mammals. In human, rat and mouse CACNA1A, the use of an alternative acceptor site at the exon 46‐47 boundary results in the expression of a long Cav2.1 splice variant. In transfected cells, the long isoform of human Cav2.1 produces a C‐terminal fragment, but it is not known whether this fragment affects Cav2.1 expression or functional properties. Here, we cloned the long isoform of rat Cav2.1 (Cav2.1(e47)) and identified a novel variant with a shorter C‐terminus (Cav2.1(e47s)) that differs from those previously described in the rat and mouse. When expressed in Xenopus laevis oocytes, Cav2.1(e47) and Cav2.1(e47s) displayed functional properties similar to those of the short isoform (Cav2.1). We show that Cav2.1 isoforms produced short (CT1) and long (CT1(e47)) C‐terminal fragments that interacted in vivo with the auxiliary Cavβ4a subunit. Overexpression of the C‐terminal fragments did not affect Cav2.1 expression or functional properties. Furthermore, the functional properties of a Cav2.1 mutant without the C‐terminal Cavβ4 binding domain (Cav2.1ΔCT2) were similar to those of Cav2.1, and were not influenced by the co‐expression of the missing fragments (CT2 or CT2(e47)). Our results exclude a functional role of the C‐terminal fragments in Cav2.1 biophysical properties in an expression system widely used to study this channel.

    REPRESENTAÇÕES DO BRASIL E DO BRASILEIRO NO DISCURSO DO JORNAL ESPANHOL EL PAÍS NO CONTEXTO PRÉ-COPA FIFA DE 2014

    The production of international media discourse about Brazil's political scene has been in ferment in recent years owing to the several sporting mega-events hosted by the country, from the Confederations Cup in 2013 and the World Cup in 2014 to the Olympic Games in 2016, which gave the country great visibility abroad. This work aims to analyze opinion articles from the Spanish newspaper El País, published between 2013 and 2014, in order to observe the discursive structures present with regard to representations and stereotypes of Brazil and of Brazilians in the context of the political demonstrations of that period. To that end, we draw on the premises of Critical Discourse Analysis (CDA), especially the work of Van Dijk (2008; 2012), in dialogue with scholars from the Social Sciences (BOURDIEU, 1989), Social Psychology (MOSCOVICI, 1978; 2004), Cultural Studies (HALL, 2000; 2011), and other authors who contributed to building a socio-historical panorama of Brazil (ZWEIG, 2013; AGASSIZ, 1975; RAEDERS, 1988), which gives the study a multidisciplinary slant grounded in Linguistics. We adopt Critical Discourse Analysis as our approach because, as a critical perspective whose main purpose is to examine the relations between discourse and power, it allows us to investigate a possible intergroup polarization between Europeans and Brazilians, the consequent (re)production of unfavorable representations of Brazil and of Brazilians, and the existence of a hierarchical relationship between cultures, on the basis of Spanish journalistic discourse.

    Gene Regulatory Network Reconstruction Using Bayesian Networks, the Dantzig Selector, the Lasso and Their Meta-Analysis

    Modern technologies, especially next-generation sequencing facilities, are giving cheaper access to genotype and genomic data measured on the same sample at once. This creates an ideal situation for multifactorial experiments designed to infer gene regulatory networks. The fifth “Dialogue for Reverse Engineering Assessments and Methods” (DREAM5) challenges are aimed at assessing methods and associated algorithms devoted to the inference of biological networks. Challenge 3 on “Systems Genetics” proposed to infer causal gene regulatory networks from different genetical genomics data sets. We investigated a wide panel of methods, ranging from Bayesian networks to penalised linear regressions, to analyse such data, and proposed a simple yet very powerful meta-analysis that combines these inference methods. We present results of the Challenge as well as a more in-depth analysis of the predicted networks in terms of structure and reliability. The developed meta-analysis was ranked first among the teams participating in Challenge 3A. It paves the way for future extensions of our inference method and more accurate gene network estimates in the context of genetical genomics.

    Modèles markoviens graphiques pour la fusion de données individuelles et d'interactions : application à la classification de gènes

    The research presented in this dissertation concerns the statistical integration of heterogeneous post-genomics data. Gene clustering aims at gathering the genes of a living organism, modeled as a complex system, into meaningful groups according to experimental data, in order to decipher the roles of the genes acting within the biological mechanisms under study. We base our approach on probabilistic graphical models. More specifically, we use Hidden Markov Random Fields (HMRF), which allow us to account simultaneously for gene-specific features, through probability distributions, and for network data that encode our knowledge of existing interactions between these genes, through an undirected graph. After setting out the biological issues addressed, we describe the model we used as well as algorithmic strategies for parameter estimation (namely mean field-like approximations). We then examine two specificities of the data we were faced with: missing observations and high dimensionality. Both lead to refinements of the model under consideration. Lastly, we present experiments on both simulated and real yeast data to assess the gain from using our method. In particular, our goal was to emphasize biologically plausible interpretations of our results.

    Gene clustering via integrated Markov models combining individual and pairwise features.

    Clustering of genes into groups sharing common characteristics is a useful exploratory technique for a number of subsequent computational analyses. A wide range of clustering algorithms have been proposed, in particular to analyze gene expression data, but most of them consider genes as independent entities or include relevant information on gene interactions in a suboptimal way. We propose a probabilistic model that has the advantage of accounting for individual data (e.g., expression) and pairwise data (e.g., interaction information coming from biological networks) simultaneously. Our model is based on hidden Markov random field models in which parametric probability distributions account for the distribution of individual data. Data on pairs, possibly reflecting distance or similarity measures between genes, are then included through a graph, where the nodes represent the genes and the edges are weighted according to the available interaction information. As a probabilistic model, this model has many interesting theoretical features. In addition, preliminary experiments on simulated and real data show promising results and point out the gain in using such an approach. Availability: The software used in this work is written in C++ and is available with other supplementary material at http://mistis.inrialpes.fr/people/forbes/transparentia/supplementary.html

    A model-based approach to gene clustering with missing observation reconstruction in a Markov random field framework

    The different measurement techniques that interrogate biological systems provide means for monitoring the behavior of virtually all cell components at different scales and from complementary angles. However, data generated in these experiments are difficult to interpret. A first difficulty arises from the high dimensionality and inherent noise of such data. Organizing them into meaningful groups is then highly desirable to improve our knowledge of biological mechanisms. A more accurate picture can be obtained when accounting for dependencies between the components (e.g., genes) under study. A second difficulty arises from the fact that biological experiments often produce missing values. When it is not ignored, the latter issue has been addressed by imputing the expression matrix prior to applying traditional analysis methods. Although helpful, this practice can lead to unsound results. We propose in this paper a statistical methodology that integrates individual dependencies in a missing data framework. More explicitly, we present a clustering algorithm dealing with incomplete data in a Hidden Markov Random Field context. This tackles the missing value issue in a probabilistic framework and still allows us to reconstruct missing observations a posteriori without imposing any pre-processing of the data. Experiments on synthetic data validate the gain in using our method, and analysis of real biological data shows its potential to extract biological knowledge.

    Inferring large graphs using ℓ1-penalized likelihood

    We address the issue of recovering the structure of large sparse directed acyclic graphs from noisy observations of the system. We propose a novel procedure based on a specific formulation of the ℓ1-norm regularized maximum likelihood, which decomposes the graph estimation into two optimization sub-problems: topological structure and node order learning. We provide convergence inequalities for the graph estimator, as well as an algorithm to solve the induced optimization problem, in the form of a convex program embedded in a genetic algorithm. We apply our method to various data sets (including data from the DREAM4 challenge) and show that it compares favorably to state-of-the-art methods. This algorithm is available on CRAN as the R package GADAG.